Based on the code from the GitHub repo, you are getting the dimension mismatch because the encoder uses https://www.tensorflow.org/api_docs/python/tf/nn/bidirectional_dynamic_rnn. The cell state passed to the decoder is the concatenation of the final states of cell_fw and cell_bw, so the encoder's state has shape [batch_size, 2 * encoder_hidden_units]. Since the decoder's initial state is set to the encoder's final state, the two state sizes must match, which means decoder_hidden_units has to be twice encoder_hidden_units (not the other way around).
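Here is a minimal NumPy sketch of just the shape arithmetic (not the actual TensorFlow graph); the variable names and concrete sizes are illustrative assumptions, standing in for the fw/bw states that bidirectional_dynamic_rnn returns:

```python
import numpy as np

batch_size = 4
encoder_hidden_units = 10
# The decoder state must match the concatenated fw+bw encoder state,
# so it needs to be twice the encoder's hidden size.
decoder_hidden_units = 2 * encoder_hidden_units

# Stand-ins for the final states of cell_fw and cell_bw
state_fw = np.zeros((batch_size, encoder_hidden_units))
state_bw = np.zeros((batch_size, encoder_hidden_units))

# bidirectional_dynamic_rnn returns separate fw/bw final states;
# concatenating them along the feature axis gives the encoder state
encoder_final_state = np.concatenate([state_fw, state_bw], axis=1)

# This is the shape the decoder's initial state has to accept
assert encoder_final_state.shape == (batch_size, decoder_hidden_units)
```

If instead you keep encoder and decoder the same size, the alternative fix is to project the concatenated state down to decoder_hidden_units with a dense layer before passing it in.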